recordExtractor
What to extract from crawled pages and how to format it as Algolia records.
The recordExtractor action is a function that takes an object as its input parameter and returns a list of records.
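A minimal sketch of that shape, assuming the input object exposes the page URL as url (a URL object) and the Cheerio instance as $. Neither name is stated in this section. The objectID and title attributes are illustrative, and fake$ is a hypothetical stand-in for Cheerio so the sketch runs outside the crawler.

```javascript
// Sketch of a record extractor: one input object in, an array of records out.
// Assumption: the input object carries `url` (a URL object) and `$` (a Cheerio instance).
const recordExtractor = ({ url, $ }) => {
  // Read the page title and strip surrounding whitespace.
  const title = $('title').text().trim();
  return [
    {
      objectID: url.href, // illustrative choice of unique identifier
      title,
    },
  ];
};

// Hypothetical stand-in for Cheerio: supports only $(selector).text().
const fake$ = (selector) => ({
  text: () => (selector === 'title' ? '  Getting started  ' : ''),
});

const records = recordExtractor({
  url: new URL('https://example.com/docs/getting-started'),
  $: fake$,
});
console.log(records);
```

In a real crawler configuration, the crawler supplies the input object, so only the function itself would be needed.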
Parameters
Use one or more of the following properties of the input object in your recordExtractor to determine what information is returned.
A Cheerio instance with the HTML of the crawled page. For more information, see Extracting data with Cheerio.
The size of the crawled page in bytes.
The external data sources of the current URL. Each key of this object corresponds to an externalData object.
For example:
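A hypothetical illustration, assuming the property is named dataSources and that an external data source called pageViews with a visits field exists. These names are assumptions made for this sketch, not taken from this page.

```javascript
// Assumption: external data arrives as `dataSources`, keyed by external data source name.
// `pageViews` and `visits` are hypothetical names used only in this sketch.
const extractWithExternalData = ({ url, dataSources }) => {
  // Fall back to 0 when no external data exists for the current URL.
  const visits = dataSources.pageViews ? dataSources.pageViews.visits : 0;
  return [{ objectID: url.href, visits }];
};

const withData = extractWithExternalData({
  url: new URL('https://example.com/pricing'),
  dataSources: { pageViews: { visits: 1200 } },
});

const withoutData = extractWithExternalData({
  url: new URL('https://example.com/about'),
  dataSources: {},
});

console.log(withData, withoutData);
```

Guarding against a missing key keeps the extractor from throwing on pages that have no matching entry in the external source.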
Helpers are functions that extract content and generate records. They can simplify your record extractor.
Returns
The record extractor returns an array of records with attributes or an empty array. If it returns an empty array, the page is skipped: no records are created for it.
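The two return shapes can be sketched as follows, assuming the page size in bytes is exposed as contentLength and the page URL as url (both names are assumptions). The 500-byte threshold and the size attribute are arbitrary choices for illustration.

```javascript
// Assumption: `contentLength` is the size of the crawled page in bytes.
// The threshold below is a hypothetical cut-off, not a recommended value.
const MIN_BYTES = 500;

const skipThinPages = ({ url, contentLength }) => {
  // Returning an empty array skips the page.
  if (contentLength < MIN_BYTES) return [];
  // Otherwise return an array of records with attributes.
  return [{ objectID: url.href, size: contentLength }];
};

const skipped = skipThinPages({
  url: new URL('https://example.com/empty'),
  contentLength: 120,
});

const kept = skipThinPages({
  url: new URL('https://example.com/guide'),
  contentLength: 4096,
});

console.log(skipped, kept);
```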